1 ID code consistency between .fam and RRB phenotypes

  • The .fam IDs and the RRB phenotype IDs do not use the same coding
  • A match list can be found in microarray/Microarray_clean.txt
  • The RRB phenotype IDs are then updated to the .fam IDs using this match list
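
The ID update described above could look roughly like the sketch below. It is only an illustration under assumptions: the column names `pheno_id` and `fam_id`, the table `rrb`, and all toy values are hypothetical, because the layout of microarray/Microarray_clean.txt is not shown in this report.

```r
# Hypothetical match list: one row per individual, mapping the ID used in the
# RRB phenotype files (pheno_id) to the IID used in the .fam files (fam_id).
match_list <- data.frame(
  pheno_id = c("11000.p1", "11000.s1", "11001.p1"),
  fam_id   = c("A_R01C01", "A_R01C02", "B_R01C01"),
  stringsAsFactors = FALSE
)

# Toy RRB phenotype table keyed by phenotype ID (values are made up).
rrb <- data.frame(
  id    = c("11001.p1", "11000.p1", "99999.p1"),
  score = c(3, 7, 5),
  stringsAsFactors = FALSE
)

# Look up the .fam IID for every phenotype row; match() returns NA when an
# individual has no entry in the match list.
rrb$fam_iid <- match_list$fam_id[match(rrb$id, match_list$pheno_id)]

# Keep only individuals that can be linked to a genotyped sample.
rrb_matched <- rrb[!is.na(rrb$fam_iid), ]
```

Keeping the unmatched rows visible (as NA) before filtering makes it easy to count how many phenotype records have no genotyped counterpart.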


2 Structure of three SSC chips

  • 10220 individuals from 2591 SSC families were genotyped on three chips.
  • Note that all members of a family were genotyped on the same array.

2.1 Illumina 1Mv3

  • 1189 families
  • 4626 people (2703 males, 1923 females)
  • 1199033 SNPs


2.2 Illumina 1Mv1

  • 333 families
  • 1354 people (801 males, 553 females)
  • 1072814 SNPs


2.3 Illumina HumanOmni2.5M

  • 1069 families
  • 4240 people (2490 males, 1750 females)
  • 2440283 SNPs
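
Per-chip counts like those above, and the check that each family sits on a single array, could be derived from a combined .fam-style table along the following lines. This is a sketch under assumptions: the data frame `fam`, its `chip` column, and the toy rows are hypothetical and do not reproduce the SSC data.

```r
# Toy .fam-style table with a hypothetical chip column (SEX: 1 = male, 2 = female).
fam <- data.frame(
  FID  = c("f1", "f1", "f1", "f2", "f2", "f3", "f3", "f3"),
  IID  = paste0("i", 1:8),
  SEX  = c(1, 2, 1, 1, 2, 2, 1, 1),
  chip = c("1Mv3", "1Mv3", "1Mv3", "1Mv1", "1Mv1",
           "Omni2.5", "Omni2.5", "Omni2.5"),
  stringsAsFactors = FALSE
)

# One summary row per chip: families, people, males, females.
chip_summary <- do.call(rbind, lapply(split(fam, fam$chip), function(d) {
  data.frame(
    chip     = d$chip[1],
    families = length(unique(d$FID)),
    people   = nrow(d),
    males    = sum(d$SEX == 1),
    females  = sum(d$SEX == 2),
    stringsAsFactors = FALSE
  )
}))

# Each family should appear on exactly one chip.
chips_per_family <- tapply(fam$chip, fam$FID, function(x) length(unique(x)))
all_same_array <- all(chips_per_family == 1)
```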


3 RRB phenotypes

3.1 Phenotype description

  • phenotype counts and levels
  • primary and secondary variable information for probands

3.1.1 Probands (2588 individuals)


  • 56 phenotypes including age, sex and race
datatable(pro_summs2, rownames = FALSE, filter="top", options = list(pageLength = 5, scrollX=T) )


3.1.2 Unaffected Siblings (2098 individuals)


  • 27 phenotypes including sex
mu2 <- mus[[2]]
datatable(mu2, rownames = FALSE, filter="top", options = list(pageLength = 5, scrollX=T) )


3.1.3 Other Siblings (296 individuals)


  • 26 phenotypes
mu3 <- mus[[3]]
datatable(mu3, rownames = FALSE, filter="top", options = list(pageLength = 5, scrollX=T) )


3.2 Phenotype distributions

3.2.1 Probands

grid.raster(readPNG("figures/ssc_probands_phe.png"))



3.2.2 Unaffected Siblings

grid.raster(readPNG("figures/ssc_unaffSibs_phe.png"))



3.2.3 Other Siblings

grid.raster(readPNG("figures/ssc_otherSibs_phe.png"))



3.3 Shared phenotypes

  • Proband VS Unaffected Siblings: 15
  • Proband VS Other Siblings: 14
  • Unaffected Siblings VS Other Siblings: 26
  • Proband VS Unaffected Siblings VS Other Siblings: 14
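
Overlap counts of this kind can be obtained by intersecting the phenotype names after the group suffix is stripped. The sketch below assumes hypothetical variable names (`abc_total`, `srs_total`, and so on); only the suffixes `_Proband`, `_Unaff_Sibs` and `_Other_Sibs` come from this report.

```r
# Hypothetical variable names as they might appear in the per-group summaries.
pro_vars   <- c("age_Proband", "sex_Proband", "race_Proband", "abc_total_Proband")
unaff_vars <- c("sex_Unaff_Sibs", "abc_total_Unaff_Sibs", "srs_total_Unaff_Sibs")
other_vars <- c("abc_total_Other_Sibs", "srs_total_Other_Sibs")

# Strip the group suffix so the same phenotype has the same name in every group.
pheno <- list(
  Proband    = sub("_Proband$", "", pro_vars),
  Unaff_Sibs = sub("_Unaff_Sibs$", "", unaff_vars),
  Other_Sibs = sub("_Other_Sibs$", "", other_vars)
)

# Pairwise and three-way overlaps of phenotype names.
shared_counts <- c(
  pro_unaff   = length(intersect(pheno$Proband, pheno$Unaff_Sibs)),
  pro_other   = length(intersect(pheno$Proband, pheno$Other_Sibs)),
  unaff_other = length(intersect(pheno$Unaff_Sibs, pheno$Other_Sibs)),
  all_three   = length(Reduce(intersect, pheno))
)
```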

3.3.1 Shared phenotype summary

# Per-group phenotype summaries: 1 = probands, 2 = unaffected siblings, 3 = other siblings
mu <- mus[[1]]
mu2 <- mus[[2]]
mu3 <- mus[[3]]

# Label each summary with its group
mu$set <- "Proband"
mu2$set <- "Unaff_Sibs"
mu3$set <- "Other_Sibs"
# Strip the group suffix so the same phenotype shares one name across groups
mu$pheno <- gsub("_Proband", "", mu$Variable)
mu2$pheno <- gsub("_Unaff_Sibs", "", mu2$Variable)
mu3$pheno <- gsub("_Other_Sibs", "", mu3$Variable)
# Full joins keep phenotypes that are missing from one or two groups (shown as NA)
alls <- 
  mu[,c("Counts", "pheno")] %>% full_join(mu2[,c("Counts", "pheno")], by = "pheno") %>%
  full_join(mu3[,c("Counts", "pheno")], by = "pheno")  %>%
  select(pheno, everything())
names(alls) <- c("Variable", "Probands", "Unaffected Siblings", "Other Siblings")

datatable(alls, rownames = FALSE, filter="top", options = list(pageLength = 5, scrollX=TRUE) )


3.3.2 Shared phenotype distribution

3.3.2.1 Probands VS Unaffected Siblings

grid.raster(readPNG("figures/prob_VS_Unaff_sibs_dist.png"))



3.3.2.2 Probands VS Other Siblings

grid.raster(readPNG("figures/prob_VS_Other_sibs_dist.png"))


3.3.2.3 Probands VS Siblings

grid.raster(readPNG("figures/prob_VS_sibs_dist.png"))



4 Check sex information

  • Using the plink --check-sex option on the post-QC genotypes in impute_pipe
  • Genotype QC: --geno 0.05 --hwe 1e-6 --mind 0.1 --maf 0.01
  • Note: PAR regions (CHR 25) and the heterozygous haploid calls listed in .hh are removed; only CHR 23 is used
  • Chr-X F > 0.8 is called male (coded as 1); F < 0.2 is called female (coded as 2)
  • 11 PROBLEM individuals are included in the RRB phenotype files
    • 10 are consistent with the sex recorded in the unaffected-sibling phenotype files
    • 1 has no sex information available from the other-sibling phenotype files (11712 4584699075_R01C02 2 0 PROBLEM 0.7319 UCSF_1Mv3 11712.s2)
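
The calling rule in the bullets above can be written out as a small R function. This is a sketch that mimics the thresholds stated here (F > 0.8 male, F < 0.2 female), not plink's own code; the function name `call_sex` and the toy table are hypothetical, while the column names PEDSEX, SNPSEX, STATUS and F follow the .sexcheck output.

```r
# Call sex from the chr-X inbreeding coefficient F:
# 1 = male (F > 0.8), 2 = female (F < 0.2), 0 = ambiguous or missing.
call_sex <- function(f, male_min = 0.8, female_max = 0.2) {
  ifelse(is.na(f), 0L,
         ifelse(f > male_min, 1L,
                ifelse(f < female_max, 2L, 0L)))
}

# Toy table (values are illustrative only).
check <- data.frame(
  IID    = paste0("s", 1:5),
  PEDSEX = c(1, 2, 2, 1, 0),
  F      = c(0.98, 0.02, 0.7319, 0.10, 0.95),
  stringsAsFactors = FALSE
)

check$SNPSEX <- call_sex(check$F)

# OK only when the recorded sex is known and agrees with the genotype call.
check$STATUS <- ifelse(check$PEDSEX != 0 & check$PEDSEX == check$SNPSEX,
                       "OK", "PROBLEM")
```

An individual such as the one listed above (recorded as 2, F = 0.7319) falls between the two thresholds, so no sex is called and the status is PROBLEM.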


4.1 Mismatch summary

sex <- read.table("../outputs/sexcheck.txt", header = T, stringsAsFactors = F)
res_sex1 <- sex[sex$chip == "UCSF_1Mv3",]
res_sex2 <- sex[sex$chip == "UCSF_1Mv1",]
res_sex3 <- sex[sex$chip == "UCSF_Omni2.5",]

fams <- read.table("../outputs/fams_covarites.txt", stringsAsFactors = F)
fam1 <- fams[fams$V7 == "1Mv3",]
fam2 <- fams[fams$V7 == "1Mv1",]
fam3 <- fams[fams$V7 == "Omni2.5",]


## Cross-check the flagged individuals against the RRB phenotype file
pros <- sex[sex$STATUS == "PROBLEM",]
## match() returns NA for IIDs that are absent from fams, so unmatched rows stay NA
pros$RRB <- fams[match(pros$IID, fams$V2), "V8"]

datatable(pros, rownames = FALSE, filter="top", caption = htmltools::tags$caption(
        style = 'caption-side: top; text-align: center; color:black; 
        font-size:200% ;',"Individuals with discordant sex information"), options = list(pageLength = 5, scrollX=T) )



4.2 Chr-X F distributions

4.2.1 Illumina 1Mv3



  • 31 PROBLEM
grid.raster(readPNG("figures/sex_1mv3.png"))

4.2.2 Illumina 1Mv1



  • 12 PROBLEM
grid.raster(readPNG("figures/sex_1mv1.png"))

4.2.3 Illumina Omni2.5M



  • 9 PROBLEM
grid.raster(readPNG("figures/sex_omni.png"))



5 Pairwise IBD estimation

  • Using plink --genome rel-check on the genome-wide post-QC genotypes
  • Genotype QC: --geno 0.05 --hwe 1e-6 --mind 0.1 --maf 0.01; SNPs are then pruned with --indep-pairwise to about 50K SNPs
  • Only pairs of individuals within the same family are checked
  • Relationships (RT): OT (Parents), FS (Full Siblings), PO (Parent-Offspring)
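
One way to read the within-family IBD estimates is to compare each pair's PI_HAT with the value expected for its relationship type: about 0.5 for PO and FS pairs, and about 0 for an unrelated parent pair (OT). The sketch below assumes a tolerance of 0.1, which is an arbitrary illustrative choice and not a threshold used in this report; the toy rows are made up, and the column names follow the plink .genome output.

```r
# Toy within-family pairs in .genome style (values are illustrative only).
genome <- data.frame(
  FID1   = c("f1", "f1", "f1", "f2"),
  IID1   = c("fa", "p1", "fa", "mo"),
  FID2   = c("f1", "f1", "f1", "f2"),
  IID2   = c("p1", "s1", "mo", "p1"),
  RT     = c("PO", "FS", "OT", "PO"),
  PI_HAT = c(0.50, 0.48, 0.01, 0.02),
  stringsAsFactors = FALSE
)

# Expected proportion of IBD sharing per relationship type.
expected_pi <- c(PO = 0.5, FS = 0.5, OT = 0)

# Assumed tolerance around the expected value (hypothetical choice).
tol <- 0.1

genome$expected <- unname(expected_pi[genome$RT])
genome$flag     <- abs(genome$PI_HAT - genome$expected) > tol

# Pairs whose estimated sharing does not fit the pedigree.
suspect_pairs <- genome[genome$flag, c("FID1", "IID1", "IID2", "RT", "PI_HAT")]
```

In this toy table the last pair is recorded as parent-offspring but shares almost nothing, which is the pattern a sample swap or pedigree error would produce.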

5.1 Estimated pairwise IBD distributions

5.1.1 Illumina 1Mv3

grid.raster(readPNG("figures/ibd_1mv3.png"))



5.1.2 Illumina 1Mv1

grid.raster(readPNG("figures/ibd_1mv1.png"))



5.1.3 Illumina Omni2.5

grid.raster(readPNG("figures/ibd_omni.png"))



5.2 Estimated pairwise IBD VS Chr-X F

5.2.1 Illumina 1Mv3

grid.raster(readPNG("figures/ibd_sexF_1mv3.png"))



5.2.2 Illumina 1Mv1

grid.raster(readPNG("figures/ibd_sexF_1mv1.png"))



5.2.3 Illumina Omni2.5

grid.raster(readPNG("figures/ibd_sexF_omni.png"))





6 Individual genome-wide heterozygosity

  • Using --het to calculate genome-wide heterozygosity (on the pruned SNPs)
  • Mean heterozygosity = (N - O)/N, where N is the number of non-missing genotypes and O is the observed number of homozygous genotypes; Het_mean <- (N.NM. - O.HOM.)/N.NM.
  • Using --missing to calculate missing rates (individuals with a missing rate > 0.1 will be removed)
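
The heterozygosity formula above can be applied directly to a table read from the --het output. The sketch below uses a toy data frame instead of a real file; the column names O.HOM., E.HOM. and N.NM. are what read.table() produces from the plink headers O(HOM), E(HOM) and N(NM), and all values are made up.

```r
# Toy stand-in for read.table("<prefix>.het", header = TRUE).
het <- data.frame(
  FID    = c("f1", "f1"),
  IID    = c("p1", "s1"),
  O.HOM. = c(33000, 34000),   # observed homozygous genotypes
  E.HOM. = c(33500, 33500),   # expected homozygous genotypes
  N.NM.  = c(50000, 50000),   # non-missing genotypes
  stringsAsFactors = FALSE
)

# Mean heterozygosity: share of non-missing genotypes that are heterozygous.
het$Het_mean <- (het$N.NM. - het$O.HOM.) / het$N.NM.
```

With these toy numbers the two individuals have mean heterozygosity 0.34 and 0.32.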

6.1 Genome-wide heterozygosity VS missing rates

grid.raster(readPNG("figures/p_het.png"))

grid.raster(readPNG("figures/p_F.png"))



  • Grey horizontal lines: y = mean +/- 3SD
  • Red horizontal lines: y = mean +/- 5SD
  • Grey vertical line: x = 0.1 (missing-rate cutoff)
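
The reference lines in these plots correspond to simple outlier rules, sketched below. The helper `flag_sd_outliers` and the toy values are hypothetical; the 3SD and 5SD bands and the 0.1 missing-rate cutoff are the ones listed above, and F_MISS is the missing-rate column of the plink .imiss output.

```r
# Flag values lying outside mean +/- k standard deviations.
flag_sd_outliers <- function(x, k = 3) {
  m <- mean(x, na.rm = TRUE)
  s <- sd(x, na.rm = TRUE)
  x < m - k * s | x > m + k * s
}

# Toy data: 19 typical individuals plus one with low heterozygosity.
qc <- data.frame(
  IID      = paste0("s", 1:20),
  Het_mean = c(rep(c(0.31, 0.32, 0.33), length.out = 19), 0.20),
  F_MISS   = c(rep(0.01, 18), 0.15, 0.02),
  stringsAsFactors = FALSE
)

qc$out_3sd   <- flag_sd_outliers(qc$Het_mean, k = 3)  # grey horizontal lines
qc$out_5sd   <- flag_sd_outliers(qc$Het_mean, k = 5)  # red horizontal lines
qc$high_miss <- qc$F_MISS > 0.1                       # grey vertical line
```

Because the outlier itself inflates the standard deviation, a sample can fall outside the 3SD band without reaching the 5SD band, as the last individual does here.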


6.2 Genome-wide heterozygosity VS IBD estimation (PI_HAT)

6.2.1 Illumina 1Mv3

grid.raster(readPNG("figures/p_het_ibd1.png"))



6.2.2 Illumina 1Mv1

grid.raster(readPNG("figures/p_het_ibd2.png"))



6.2.3 Illumina Omni2.5

grid.raster(readPNG("figures/p_het_ibd3.png"))